Updated:(deps): Bump html-to-markdown-rs from 2.23.1 to 2.27.2 in /src #51

Merged
Sewer56 merged 1 commit into main from dependabot/cargo/src/html-to-markdown-rs-2.27.2 on Mar 4, 2026

Conversation

dependabot bot commented on behalf of github on Mar 2, 2026

Bumps html-to-markdown-rs from 2.23.1 to 2.27.2.

Release notes

Sourced from html-to-markdown-rs's releases.

v2.27.2

Fixed

  • Plain text list items missing markers: <ul> and <ol> list items in OutputFormat::Plain were output without any bullet or number prefix. Now emits - for unordered lists and sequential N. for ordered lists, respecting the start attribute on <ol>.
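The marker behavior described in this fix can be illustrated with a small, std-only sketch. `render_plain_list` is a hypothetical helper written for this note, not part of the html-to-markdown-rs API, and the exact spacing after each marker is an assumption.

```rust
/// Hypothetical helper (not the crate's API): renders list items as plain text.
/// `ordered_start` is `None` for a <ul>, or `Some(n)` for an <ol> whose
/// `start` attribute is n (1 when the attribute is absent).
fn render_plain_list(items: &[&str], ordered_start: Option<i64>) -> String {
    items
        .iter()
        .enumerate()
        .map(|(i, item)| match ordered_start {
            // Ordered lists count up sequentially from the start value.
            Some(start) => format!("{}. {}", start + i as i64, item),
            // Unordered lists use a "-" bullet.
            None => format!("- {}", item),
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Under these assumptions, an `<ol start="3">` with two items would come out as `3. …` followed by `4. …`, while a `<ul>` yields `- …` lines.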

v2.27.0 - Plain Text Output Format

Added

  • Plain text output format — Set output_format to "plain" to strip all markup and return only visible text. This bypasses the full Markdown/Djot conversion pipeline for maximum speed. Useful for search indexing, text extraction, and feeding content to LLMs.

Usage

// Rust
let options = ConversionOptions { output_format: OutputFormat::Plain, ..Default::default() };
let plain = convert_html_with_options(html, &options)?;

# Python
from html_to_markdown import convert
plain = convert(html, output_format="plain")

// TypeScript / Node.js
import { convert } from "@kreuzberg/html-to-markdown-node";
const plain = convert(html, { outputFormat: "plain" });

Available across all 10+ language bindings: Rust, Python, Node.js, WASM, CLI, Ruby, PHP, Java, C#, Elixir, R, and Go.

Full Changelog: kreuzberg-dev/html-to-markdown@v2.26.3...v2.27.0

v2.26.3

Fixed

  • Subscript/superscript content silently dropped: When sub_symbol or sup_symbol was empty (the default), text inside <sub> and <sup> tags was discarded entirely — e.g. H<sub>2</sub>O produced HO instead of H2O.
  • Missing whitespace between newline-separated inline elements: Whitespace-only text nodes containing newlines between adjacent inline elements (e.g. <a>…</a>\n<a>…</a>) were dropped, causing links and other inline markup to merge without a word boundary. Now collapses to a single space per HTML white-space normalization rules.
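Both fixes in this release come down to small invariants, sketched below with the standard library only. `wrap_with_symbol` and `normalize_inline_gap` are hypothetical names chosen for illustration; they do not reflect how the crate implements the fixes internally.

```rust
/// Hypothetical: wraps <sub>/<sup> text in the configured symbol.
/// With an empty symbol (the default) the text passes through unchanged
/// instead of being dropped.
fn wrap_with_symbol(text: &str, symbol: &str) -> String {
    format!("{symbol}{text}{symbol}")
}

/// Hypothetical: a non-empty, whitespace-only text node sitting between two
/// inline elements collapses to a single space; any other text node is
/// returned as-is.
fn normalize_inline_gap(text_node: &str) -> &str {
    if !text_node.is_empty() && text_node.chars().all(|c| c.is_ascii_whitespace()) {
        " "
    } else {
        text_node
    }
}
```

With these helpers, `H<sub>2</sub>O` keeps its `2` when the symbol is empty, and a bare newline between two links becomes one space rather than disappearing.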

v2.26.2

Fixed

  • Inconsistent whitespace before inline elements across paragraphs: Fixed a stateful bug where \n before <a>, <strong>, <em>, and other inline elements inside <p> tags was handled differently depending on the paragraph's position in the document. The second and subsequent paragraphs would drop the space before inline elements, producing text[link](url) instead of text [link](url). (Issue #212, thanks @haroldparis)

Full Changelog: kreuzberg-dev/html-to-markdown@v2.26.1...v2.26.2

v2.26.0

What's Changed

... (truncated)

Changelog

Sourced from html-to-markdown-rs's changelog.

[2.27.2] - 2026-03-02

Fixed

  • Plain text list items missing markers: <ul> and <ol> list items in OutputFormat::Plain were output without any bullet or number prefix. Now emits - for unordered lists and sequential N. for ordered lists, respecting the start attribute on <ol>.

[2.27.1] - 2026-03-01

Fixed

  • Colon introduced into definition list text: <dd> elements inside <dl> were incorrectly prefixed with : (Pandoc definition list syntax), introducing spurious colons into converted text. Standard Markdown and GFM do not support definition list syntax, so <dd> content is now output as plain blocks. (Issue #214, thanks @smoyerx)
  • Go test app go.sum out of sync: Updated tests/test_apps/go/go.sum to match the v2.27.0 module version, fixing the CI Go lint job.
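To make the definition list change concrete, here is a rough sketch of rendering `<dt>`/`<dd>` pairs as plain blocks with no `: ` prefix. `render_definition_list` is a hypothetical function, and separating blocks with blank lines is an assumption about the output shape, not something the changelog specifies.

```rust
/// Hypothetical: renders (term, definition) pairs as plain blocks.
/// No Pandoc-style ": " prefix is added to the definition.
fn render_definition_list(entries: &[(&str, &str)]) -> String {
    entries
        .iter()
        // Each term and its definition become separate blocks.
        .map(|(term, definition)| format!("{term}\n\n{definition}"))
        .collect::<Vec<_>>()
        .join("\n\n")
}
```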

[2.27.0] - 2026-03-01

Added

  • Plain text output format: New OutputFormat::Plain option that strips all markup and returns only visible text content. Set output_format to "plain" (also accepts "plaintext" or "text"). This fast-path bypasses the full Markdown/Djot conversion pipeline — after DOM parsing, a lightweight text extractor walks the tree collecting only visible text with structural whitespace. Useful for search indexing, text extraction, and feeding content to LLMs.
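The tree walk described above can be sketched in a few lines. This is a simplified illustration only: the `Node` type, the hidden-tag and block-tag sets, and the function names are all assumptions made for this example, and the crate's real extractor works on its own DOM representation.

```rust
/// Hypothetical minimal DOM used only for this sketch.
enum Node {
    Text(String),
    Element { tag: String, children: Vec<Node> },
}

/// Assumed set of elements whose content is never visible.
fn is_hidden(tag: &str) -> bool {
    matches!(tag, "script" | "style" | "head")
}

/// Assumed set of block elements that end with a line break.
fn is_block(tag: &str) -> bool {
    matches!(tag, "p" | "div" | "li" | "h1" | "h2" | "h3")
}

/// Walks the tree depth-first, appending visible text to `out` and adding
/// a newline after each block element as structural whitespace.
fn collect_text(node: &Node, out: &mut String) {
    match node {
        Node::Text(text) => out.push_str(text),
        Node::Element { tag, children } => {
            if is_hidden(tag) {
                return;
            }
            for child in children {
                collect_text(child, out);
            }
            if is_block(tag) && !out.ends_with('\n') {
                out.push('\n');
            }
        }
    }
}

/// Returns only the visible text of the tree, without trailing whitespace.
fn extract_plain_text(root: &Node) -> String {
    let mut out = String::new();
    collect_text(root, &mut out);
    out.trim_end().to_string()
}
```

Because the walk never builds Markdown structures, it does a single pass over the tree, which is consistent with the changelog's description of this mode as a fast path.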

[2.26.3] - 2026-02-28

Fixed

  • Subscript/superscript content silently dropped: When sub_symbol or sup_symbol was empty (the default), text inside <sub> and <sup> tags was discarded entirely — e.g. H<sub>2</sub>O produced HO instead of H2O.
  • Missing whitespace between newline-separated inline elements: Whitespace-only text nodes containing newlines between adjacent inline elements (e.g. <a>…</a>\n<a>…</a>) were dropped, causing links and other inline markup to merge without a word boundary. Now collapses to a single space per HTML white-space normalization rules.

[2.26.2] - 2026-02-28

Fixed

  • Inconsistent whitespace before inline elements across paragraphs: Fixed a stateful bug where \n before <a>, <strong>, <em>, and other inline elements inside <p> tags was handled differently depending on the paragraph's position in the document. The second and subsequent paragraphs would drop the space before inline elements, producing text[link](url) instead of text [link](url). (Issue #212, thanks @haroldparis)

[2.26.1] - 2026-02-27

Fixed

  • YAML frontmatter in convert_with_metadata output: convert_with_metadata no longer prepends YAML frontmatter to the markdown string. Since metadata is returned as a structured ExtendedMetadata object, embedding it in the content string was redundant and polluted the output.

[2.26.0] - 2026-02-26

Added

  • C FFI distribution infrastructure: Distribution-grade C FFI library with CMake/pkg-config integration, installation scripts, and packaging for system-level consumption.
  • C FFI test coverage: Comprehensive C test suite covering conversion, metadata extraction, error handling, visitor pattern, profiling, and version queries.
  • C documentation and examples: C API reference, getting-started snippets, and example programs for basic conversion, metadata extraction, and visitor pattern usage.

Fixed

  • R package r-universe build: Configure scripts now download the source archive from GitHub when the monorepo is unavailable, enabling r-universe and standalone source installs to vendor crates automatically.

... (truncated)

Commits
  • 26a04f0 chore: release v2.27.2
  • 1cfb095 fix: resolve CI failures in validate, Go, and R workflows
  • 7b2ddae fix: remove Pandoc definition list colon prefix from dd elements (#214)
  • 9897a0e fix: resolve CI failures in validate, R, and Node Windows
  • 6626698 feat: add plain text output format
  • 69ebfd1 chore: release v2.26.3
  • e1d0b02 fix: preserve sub/sup content and inline whitespace
  • 5da9df1 chore: updated test apps
  • 9878298 chore: updated deps and cleanup lint issues
  • d46f828 build: pin mkdocs < 2.0 to prevent breaking changes
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [html-to-markdown-rs](https://github.com/kreuzberg-dev/html-to-markdown) from 2.23.1 to 2.27.2.
- [Release notes](https://github.com/kreuzberg-dev/html-to-markdown/releases)
- [Changelog](https://github.com/kreuzberg-dev/html-to-markdown/blob/main/CHANGELOG.md)
- [Commits](kreuzberg-dev/html-to-markdown@v2.23.1...v2.27.2)

---
updated-dependencies:
- dependency-name: html-to-markdown-rs
  dependency-version: 2.27.2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

dependabot Bot commented on behalf of github Mar 2, 2026

Labels

The following labels could not be found: dependencies, rust. Please create them before Dependabot can add them to a pull request.

Please fix the above issues or remove invalid values from dependabot.yml.
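One way to silence this warning without creating the labels is to set the labels explicitly in dependabot.yml. The fragment below is a sketch, not this repository's actual configuration: the `cargo` ecosystem and `/src` directory are inferred from the branch name and PR title, and the `weekly` schedule is an assumption.

```yaml
# .github/dependabot.yml (illustrative; schedule is assumed)
version: 2
updates:
  - package-ecosystem: "cargo"
    directory: "/src"
    schedule:
      interval: "weekly"
    # An empty list disables the default "dependencies" and "rust" labels.
    labels: []
```

The alternative is to create the `dependencies` and `rust` labels in the repository so that Dependabot can apply its defaults.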

codecov Bot commented Mar 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 75.20%. Comparing base (89606bc) to head (3ed9342).
⚠️ Report is 8 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #51      +/-   ##
==========================================
+ Coverage   75.16%   75.20%   +0.04%     
==========================================
  Files          67       67              
  Lines        2029     2029              
==========================================
+ Hits         1525     1526       +1     
+ Misses        504      503       -1     
Flag Coverage Δ
unittests 75.20% <ø> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown.
1 file has indirect coverage changes.

@Sewer56 Sewer56 merged commit e428e77 into main Mar 4, 2026
9 checks passed
@Sewer56 Sewer56 deleted the dependabot/cargo/src/html-to-markdown-rs-2.27.2 branch March 4, 2026 21:20
Sewer56 added a commit that referenced this pull request Mar 30, 2026
…down-rs-2.27.2

Updated:(deps): Bump html-to-markdown-rs from 2.23.1 to 2.27.2 in /src